04 March, 2025
Welcome to this course on creating interactive maps using the leaflet package.
Leaflet is a popular open-source JavaScript library for creating mobile-friendly, interactive maps.
Thanks to the work of many leading R developers, we can create interactive maps with just a few lines of R code.
You may already be familiar with leaflet maps, as they are used by leading technology companies, nonprofits, and government organizations to create informative and interactive maps.
This map enables users to explore parks and monuments by state.
For example, Maine has one National Park, one International Park, and one National Monument. Clicking on any of these locations reveals a pop-up with additional information.
This leaflet map plots all four-year colleges in America and color codes these institutions by sector to indicate if they are public, private, or for-profit.
Using the control panel in the upper right-hand corner, we can toggle the map between different base maps and we can select which sectors of colleges appear on the map.
Later in the lecture, we will add a few pieces of flair to this map, such as labels that appear when hovering and the ability to search for a particular college.
leaflet builds maps using tiles.
Tiled web maps join many map images together. When a user zooms or pans your interactive leaflet map, new tiles are fetched as needed to provide the requested view of the map.
Let’s take a look at how this works in R.
library(leaflet)
leaflet() %>%
addTiles()
First, we load the leaflet library.
Then we initialize an html widget with the leaflet() function call.
You’ll notice that leaflet leverages the pipe operator that is common in the tidyverse.
For example, we can pipe the result of our leaflet() call into the addTiles() function to create an interactive map with just two lines of R code!
For now, you will practice using different map tiles while working towards creating an interactive map that displays the locations of two colleges.
We will use the leaflet package to initialize an HTML widget and add a map tile using addTiles().
Steps:
Load the leaflet library.
Call the leaflet() function to initialize the map.
Pipe the output into addTiles() to add default map tiles.
Run and explore the interactive map.
# Load the leaflet library
library(leaflet)
# Create a leaflet map with a default map tile using addTiles()
leaflet() %>%
addTiles()
As you work through the exercises, I encourage you to experiment with different base maps to expand your awareness of the available options.
When you are selecting a base map there are several important questions to consider.
Perhaps primary among them is “Why are you making this map in the first place?”
Is this map just for your use or is it part of a larger project that should fit within an existing design framework?
Secondly, “what type of data will you be plotting?”
Will the geographic and topographic features of the base map add to the information you are presenting or confuse your users?
In my work, I tend to prefer grayscale maps when plotting data.
I find that these maps make it easier for me to distinguish between the data that I am plotting and the data included with the base map.
There are over 100 provider tiles included in the leaflet package.
Most of these tiles can be used by calling the addProviderTiles() function.
However, there are a few, like MapBox, that you will need to register for before using them.
You can access the names of the provider tiles included in the leaflet package by calling the names() function on the providers list.
For example, to see the first five provider tiles, we call names() on the providers list and subset the result with [1:5].
names(providers)[1:5]
[1] "OpenStreetMap"        "OpenStreetMap.Mapnik" "OpenStreetMap.DE"
[4] "OpenStreetMap.CH"     "OpenStreetMap.France"
The first five tiles are all OpenStreetMap tiles, so it might be more useful to print all of the tiles provided by OpenStreetMap, which you can do by using the str_detect() function from the stringr package (load it first with library(stringr)).
names(providers)[str_detect(names(providers), "OpenStreetMap")]
[1] "OpenStreetMap"        "OpenStreetMap.Mapnik" "OpenStreetMap.DE"
[4] "OpenStreetMap.CH"     "OpenStreetMap.France" "OpenStreetMap.HOT"
[7] "OpenStreetMap.BZH"
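For example, to create a leaflet map that uses the black and white OpenStreetMap, we replace addTiles() with addProviderTiles() and pass in the name of the desired tile to the function.
leaflet() %>%
# Note: "OpenStreetMap.BlackAndWhite" is not among the OpenStreetMap names printed above;
# if it does not render, substitute a listed name such as "OpenStreetMap.HOT"
addProviderTiles("OpenStreetMap.BlackAndWhite")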
Use addTiles() to add the default OpenStreetMap (OSM) tile.
Print the providers list.
Use addProviderTiles() instead of addTiles().
# Print the providers list included in the leaflet library
providers
$OpenStreetMap
[1] "OpenStreetMap"

$OpenStreetMap.Mapnik
[1] "OpenStreetMap.Mapnik"

$OpenStreetMap.DE
[1] "OpenStreetMap.DE"

(output truncated: the list has 233 elements, each named after a provider and holding that same name as a string; the full set of names is printed by names(providers) below)
# Print only the names of the map tiles in the providers list
names(providers)
[1] "OpenStreetMap" [2] "OpenStreetMap.Mapnik" [3] "OpenStreetMap.DE" [4] "OpenStreetMap.CH" [5] "OpenStreetMap.France" [6] "OpenStreetMap.HOT" [7] "OpenStreetMap.BZH" [8] "MapTilesAPI" [9] "MapTilesAPI.OSMEnglish" [10] "MapTilesAPI.OSMFrancais" [11] "MapTilesAPI.OSMEspagnol" [12] "OpenSeaMap" [13] "OPNVKarte" [14] "OpenTopoMap" [15] "OpenRailwayMap" [16] "OpenFireMap" [17] "SafeCast" [18] "Stadia" [19] "Stadia.AlidadeSmooth" [20] "Stadia.AlidadeSmoothDark" [21] "Stadia.OSMBright" [22] "Stadia.Outdoors" [23] "Stadia.StamenToner" [24] "Stadia.StamenTonerBackground" [25] "Stadia.StamenTonerLines" [26] "Stadia.StamenTonerLabels" [27] "Stadia.StamenTonerLite" [28] "Stadia.StamenWatercolor" [29] "Stadia.StamenTerrain" [30] "Stadia.StamenTerrainBackground" [31] "Stadia.StamenTerrainLabels" [32] "Stadia.StamenTerrainLines" [33] "Thunderforest" [34] "Thunderforest.OpenCycleMap" [35] "Thunderforest.Transport" [36] "Thunderforest.TransportDark" [37] "Thunderforest.SpinalMap" [38] "Thunderforest.Landscape" [39] "Thunderforest.Outdoors" [40] "Thunderforest.Pioneer" [41] "Thunderforest.MobileAtlas" [42] "Thunderforest.Neighbourhood" [43] "CyclOSM" [44] "Jawg" [45] "Jawg.Streets" [46] "Jawg.Terrain" [47] "Jawg.Sunny" [48] "Jawg.Dark" [49] "Jawg.Light" [50] "Jawg.Matrix" [51] "MapBox" [52] "MapTiler" [53] "MapTiler.Streets" [54] "MapTiler.Basic" [55] "MapTiler.Bright" [56] "MapTiler.Pastel" [57] "MapTiler.Positron" [58] "MapTiler.Hybrid" [59] "MapTiler.Toner" [60] "MapTiler.Topo" [61] "MapTiler.Voyager" [62] "TomTom" [63] "TomTom.Basic" [64] "TomTom.Hybrid" [65] "TomTom.Labels" [66] "Esri" [67] "Esri.WorldStreetMap" [68] "Esri.DeLorme" [69] "Esri.WorldTopoMap" [70] "Esri.WorldImagery" [71] "Esri.WorldTerrain" [72] "Esri.WorldShadedRelief" [73] "Esri.WorldPhysical" [74] "Esri.OceanBasemap" [75] "Esri.NatGeoWorldMap" [76] "Esri.WorldGrayCanvas" [77] "OpenWeatherMap" [78] "OpenWeatherMap.Clouds" [79] "OpenWeatherMap.CloudsClassic" [80] "OpenWeatherMap.Precipitation" [81] 
"OpenWeatherMap.PrecipitationClassic" [82] "OpenWeatherMap.Rain" [83] "OpenWeatherMap.RainClassic" [84] "OpenWeatherMap.Pressure" [85] "OpenWeatherMap.PressureContour" [86] "OpenWeatherMap.Wind" [87] "OpenWeatherMap.Temperature" [88] "OpenWeatherMap.Snow" [89] "HERE" [90] "HERE.normalDay" [91] "HERE.normalDayCustom" [92] "HERE.normalDayGrey" [93] "HERE.normalDayMobile" [94] "HERE.normalDayGreyMobile" [95] "HERE.normalDayTransit" [96] "HERE.normalDayTransitMobile" [97] "HERE.normalDayTraffic" [98] "HERE.normalNight" [99] "HERE.normalNightMobile" [100] "HERE.normalNightGrey" [101] "HERE.normalNightGreyMobile" [102] "HERE.normalNightTransit" [103] "HERE.normalNightTransitMobile" [104] "HERE.reducedDay" [105] "HERE.reducedNight" [106] "HERE.basicMap" [107] "HERE.mapLabels" [108] "HERE.trafficFlow" [109] "HERE.carnavDayGrey" [110] "HERE.hybridDay" [111] "HERE.hybridDayMobile" [112] "HERE.hybridDayTransit" [113] "HERE.hybridDayGrey" [114] "HERE.hybridDayTraffic" [115] "HERE.pedestrianDay" [116] "HERE.pedestrianNight" [117] "HERE.satelliteDay" [118] "HERE.terrainDay" [119] "HERE.terrainDayMobile" [120] "HEREv3" [121] "HEREv3.normalDay" [122] "HEREv3.normalDayCustom" [123] "HEREv3.normalDayGrey" [124] "HEREv3.normalDayMobile" [125] "HEREv3.normalDayGreyMobile" [126] "HEREv3.normalDayTransit" [127] "HEREv3.normalDayTransitMobile" [128] "HEREv3.normalNight" [129] "HEREv3.normalNightMobile" [130] "HEREv3.normalNightGrey" [131] "HEREv3.normalNightGreyMobile" [132] "HEREv3.normalNightTransit" [133] "HEREv3.normalNightTransitMobile" [134] "HEREv3.reducedDay" [135] "HEREv3.reducedNight" [136] "HEREv3.basicMap" [137] "HEREv3.mapLabels" [138] "HEREv3.trafficFlow" [139] "HEREv3.carnavDayGrey" [140] "HEREv3.hybridDay" [141] "HEREv3.hybridDayMobile" [142] "HEREv3.hybridDayTransit" [143] "HEREv3.hybridDayGrey" [144] "HEREv3.pedestrianDay" [145] "HEREv3.pedestrianNight" [146] "HEREv3.satelliteDay" [147] "HEREv3.terrainDay" [148] "HEREv3.terrainDayMobile" [149] "FreeMapSK" [150] "MtbMap" 
[151] "CartoDB" [152] "CartoDB.Positron" [153] "CartoDB.PositronNoLabels" [154] "CartoDB.PositronOnlyLabels" [155] "CartoDB.DarkMatter" [156] "CartoDB.DarkMatterNoLabels" [157] "CartoDB.DarkMatterOnlyLabels" [158] "CartoDB.Voyager" [159] "CartoDB.VoyagerNoLabels" [160] "CartoDB.VoyagerOnlyLabels" [161] "CartoDB.VoyagerLabelsUnder" [162] "HikeBike" [163] "HikeBike.HikeBike" [164] "HikeBike.HillShading" [165] "BasemapAT" [166] "BasemapAT.basemap" [167] "BasemapAT.grau" [168] "BasemapAT.overlay" [169] "BasemapAT.terrain" [170] "BasemapAT.surface" [171] "BasemapAT.highdpi" [172] "BasemapAT.orthofoto" [173] "nlmaps" [174] "nlmaps.standaard" [175] "nlmaps.pastel" [176] "nlmaps.grijs" [177] "nlmaps.water" [178] "nlmaps.luchtfoto" [179] "NASAGIBS" [180] "NASAGIBS.ModisTerraTrueColorCR" [181] "NASAGIBS.ModisTerraBands367CR" [182] "NASAGIBS.ViirsEarthAtNight2012" [183] "NASAGIBS.ModisTerraLSTDay" [184] "NASAGIBS.ModisTerraSnowCover" [185] "NASAGIBS.ModisTerraAOD" [186] "NASAGIBS.ModisTerraChlorophyll" [187] "NLS" [188] "JusticeMap" [189] "JusticeMap.income" [190] "JusticeMap.americanIndian" [191] "JusticeMap.asian" [192] "JusticeMap.black" [193] "JusticeMap.hispanic" [194] "JusticeMap.multi" [195] "JusticeMap.nonWhite" [196] "JusticeMap.white" [197] "JusticeMap.plurality" [198] "GeoportailFrance" [199] "GeoportailFrance.plan" [200] "GeoportailFrance.parcels" [201] "GeoportailFrance.orthos" [202] "OneMapSG" [203] "OneMapSG.Default" [204] "OneMapSG.Night" [205] "OneMapSG.Original" [206] "OneMapSG.Grey" [207] "OneMapSG.LandLot" [208] "USGS" [209] "USGS.USTopo" [210] "USGS.USImagery" [211] "USGS.USImageryTopo" [212] "WaymarkedTrails" [213] "WaymarkedTrails.hiking" [214] "WaymarkedTrails.cycling" [215] "WaymarkedTrails.mtb" [216] "WaymarkedTrails.slopes" [217] "WaymarkedTrails.riding" [218] "WaymarkedTrails.skating" [219] "OpenAIP" [220] "OpenSnowMap" [221] "OpenSnowMap.pistes" [222] "AzureMaps" [223] "AzureMaps.MicrosoftImagery" [224] "AzureMaps.MicrosoftBaseDarkGrey" [225] 
"AzureMaps.MicrosoftBaseRoad" [226] "AzureMaps.MicrosoftBaseHybridRoad" [227] "AzureMaps.MicrosoftTerraMain" [228] "AzureMaps.MicrosoftWeatherInfraredMain" [229] "AzureMaps.MicrosoftWeatherRadarMain" [230] "SwissFederalGeoportal" [231] "SwissFederalGeoportal.NationalMapColor" [232] "SwissFederalGeoportal.NationalMapGrey" [233] "SwissFederalGeoportal.SWISSIMAGE"
library(stringr)
# Use str_detect() to find provider names containing 'CartoDB'
str_detect(names(providers), "CartoDB")
  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [61] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [85] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [97] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[121] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[145] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[157]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[169] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[181] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[193] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[205] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[217] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[229] FALSE FALSE FALSE FALSE FALSE
library(stringr)
# Display provider tile names that include 'CartoDB'
names(providers)[str_detect(names(providers), "CartoDB")]
 [1] "CartoDB"                      "CartoDB.Positron"
 [3] "CartoDB.PositronNoLabels"     "CartoDB.PositronOnlyLabels"
 [5] "CartoDB.DarkMatter"           "CartoDB.DarkMatterNoLabels"
 [7] "CartoDB.DarkMatterOnlyLabels" "CartoDB.Voyager"
 [9] "CartoDB.VoyagerNoLabels"      "CartoDB.VoyagerOnlyLabels"
[11] "CartoDB.VoyagerLabelsUnder"
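If you would rather not load stringr for a one-off filter, base R's grep() with value = TRUE gives the same kind of result. The sketch below is only an illustration: it runs on a short hand-typed vector, some_providers (a made-up name), copied from the names listed above so that it works without leaflet loaded. In practice you would pass names(providers) instead.

```r
# A few provider names copied from the names(providers) output above
some_providers <- c("OpenStreetMap", "OpenStreetMap.HOT",
                    "CartoDB", "CartoDB.Positron", "CartoDB.DarkMatter",
                    "Esri.WorldGrayCanvas")

# value = TRUE returns the matching strings rather than their positions;
# fixed = TRUE treats the pattern as plain text, not a regular expression
carto_tiles <- grep("CartoDB", some_providers, value = TRUE, fixed = TRUE)
carto_tiles
```

Either approach returns a character vector that can be passed, one name at a time, to addProviderTiles().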
Instead of the default OSM tile, we can use CartoDB or other tiles.
The function addProviderTiles() allows us to specify a provider from the list.
library(stringr)
library(leaflet)
# Change addTiles() to addProviderTiles() and set provider to 'CartoDB'
leaflet() %>%
addProviderTiles(provider = "CartoDB")
Now that we have seen the different providers, let’s try one of them and see how it differs from the default.
The first argument to addProviderTiles() is your leaflet map, which allows us to pipe leaflet() output directly into addProviderTiles().
The second argument is provider, which accepts any of the map tiles included in the providers list.
leaflet() %>%
addProviderTiles(provider = "CartoDB.DarkMatterNoLabels")
I’ve used the ‘CartoDB.DarkMatterNoLabels’ provider, which gives the map a dark theme without place labels.
You may have noticed that, by default, maps are zoomed out to the farthest level.
Rather than manually zooming and panning, we can load the map centered on a particular point using the setView() function.
The default zoom level is 0, the fully zoomed-out view, and zoom levels can go up to 19.
leaflet() %>%
addProviderTiles("CartoDB") %>%
setView(lat = 27.1751, lng = 78.0421, zoom = 16)
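To get a feel for what a zoom level means, recall that leaflet fetches map tiles as needed. Under the standard tiling scheme used by OpenStreetMap-style tile servers, zoom level z divides the world into a 2^z by 2^z grid of tiles. The sketch below is an illustration in base R, not part of the leaflet API, and the helper names tiles_at_zoom() and tile_index() are made up for this example.

```r
# Number of tiles covering the whole world at a given zoom level
tiles_at_zoom <- function(zoom) {
  4^zoom  # a 2^zoom by 2^zoom grid
}

# Column (x) and row (y) of the tile containing a point, using the
# usual Web Mercator tile-numbering formulas
tile_index <- function(lng, lat, zoom) {
  n <- 2^zoom
  lat_rad <- lat * pi / 180
  x <- floor((lng + 180) / 360 * n)
  y <- floor((1 - log(tan(lat_rad) + 1 / cos(lat_rad)) / pi) / 2 * n)
  c(x = x, y = y)
}

# Tile counts at zoom 0, 2, and 16
tiles_at_zoom(c(0, 2, 16))

# Tile containing the Taj Mahal at the zoom level used above
tile_index(lng = 78.0421, lat = 27.1751, zoom = 16)
```

Each extra zoom level quadruples the number of tiles, which is why only the tiles in view are downloaded rather than the whole map.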
We can limit users’ ability to pan away from the map’s focus using the options argument in the leaflet() function.
By setting minZoom and dragging, we can create an interactive web map that will always be focused on a specific area.
Note, however, that the user can still zoom with the controls, down to the minZoom level.
leaflet(options = leafletOptions(minZoom = 14, dragging = FALSE)) %>%
addProviderTiles("CartoDB") %>%
setView(lng = 78.0421, lat = 27.1751, zoom = 16)
Alternatively, if we want our users to be able to drag the map while ensuring that they do not stray too far, we can set the map’s maximum boundaries by specifying two diagonal corners of a rectangle.
# tibble provides easy-to-use functions for creating tibbles, a modern
# rethinking of data frames
library(tibble)
wonders <- tibble(
  place = c("Taj Mahal - India", "Petra - Jordan", "Christ the Redeemer - Brazil",
            "Colosseum - Italy"),
  # Christ the Redeemer lies in the southern and western hemispheres,
  # so both of its coordinates are negative
  lon = c(78.0421, 35.4444, -43.2105, 12.4922),
  lat = c(27.1751, 30.3285, -22.9519, 41.8902)
)
leaflet(options = leafletOptions(
  # Set minZoom and dragging
  minZoom = 12, dragging = TRUE)) %>%
  addProviderTiles("CartoDB") %>%
  # Set the initial zoom level (it should not be below minZoom)
  setView(lng = wonders$lon[2], lat = wonders$lat[2], zoom = 12) %>%
  # Set max bounds of map
  setMaxBounds(lng1 = wonders$lon[2] + .05,
               lat1 = wonders$lat[2] + .05,
               lng2 = wonders$lon[2] - .05,
               lat2 = wonders$lat[2] - .05)
Try dragging this map. What do you notice?
It cannot be dragged beyond the maximum bounds we set, that is, 0.05 degrees from the center in each direction.
So the map stays focused on the area of interest and cannot be dragged past the limits we set.
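If you set bounds like this often, the corner arithmetic can be wrapped in a small helper. The following is a sketch rather than anything provided by leaflet: box_around() is a made-up function, and the 0.05 degree default simply mirrors the value used above. The map is only built when the leaflet package is installed.

```r
# Hypothetical helper: two opposite corners of a square box around a point
box_around <- function(lng, lat, delta = 0.05) {
  list(lng1 = lng + delta, lat1 = lat + delta,
       lng2 = lng - delta, lat2 = lat - delta)
}

# Petra's coordinates, as in the wonders table
petra_box <- box_around(lng = 35.4444, lat = 30.3285)
str(petra_box)

# Only build the map when leaflet is installed
if (requireNamespace("leaflet", quietly = TRUE)) {
  library(leaflet)
  petra_map <- leaflet() %>%
    addProviderTiles("CartoDB") %>%
    setView(lng = 35.4444, lat = 30.3285, zoom = 12) %>%
    setMaxBounds(lng1 = petra_box$lng1, lat1 = petra_box$lat1,
                 lng2 = petra_box$lng2, lat2 = petra_box$lat2)
}
```

Typing petra_map at the console displays the result. A larger delta gives users more room to drag before the map snaps back.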
So far we have been creating maps with a single layer: a base map.
We can add layers to this base map similar to how you add layers to a plot in ggplot2.
One of the most common layers to add to a leaflet map is location markers, which you can add by piping the result of addTiles() or addProviderTiles() into the addMarkers() function.
For example, if we plot the Taj Mahal by passing its coordinates to addMarkers() as numeric vectors with one element each, our web map will place a blue drop pin at that coordinate.
leaflet() %>%
addProviderTiles("OpenStreetMap") %>%
addMarkers(lng = wonders$lon[1], lat = wonders$lat[1])
To make our map more informative we can add popups.
To add popups that appear when a marker is clicked we need to specify the popup argument in the addMarkers() function.
Once we have a map we would like to preserve, we can store it in an object.
Then we can pipe this object into functions to add or edit the map’s layers.
wondersMap <- leaflet() %>%
addTiles() %>%
addMarkers(lng = wonders$lon, lat = wonders$lat, popup = wonders$place)
We can then pipe the stored leaflet object into further functions, for example to set its view:
map_zoom <- wondersMap %>%
setView(lng = wonders$lon[4], lat = wonders$lat[4], zoom = 2)
map_zoom
If you are storing leaflet maps in objects, there will come a time when you need to remove markers or reset the view.
You can accomplish these tasks with the following functions.
clearMarkers(): removes all markers from the map
clearBounds(): clears the bounds so that they are determined automatically from the map's elements
To remove the markers and reset the bounds of our map_zoom map, we would run:
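map_zoom <- map_zoom %>%
addMarkers(lng = wonders$lon, lat = wonders$lat) %>%
# Latitude must stay within the range the map can display (roughly -85 to 85),
# so the larger value is used as the longitude
setView(lng = 88.5678, lat = 20.6843, zoom = 5)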
map_zoom %>%
clearMarkers() %>%
clearBounds()
library(leaflet)
my_map <- leaflet() %>%
addTiles() %>%
addMarkers(lat = 39.2980803, lng = -76.5898801, popup = "Jeff Leek's Office")
my_map
Adding one marker at a time is often not practical if you want to display many markers.
If you have a data frame with columns lat and lng you can pipe that data frame into leaflet() to add all the points at once.
set.seed(2016 - 4 - 25)
df <- data.frame(lat = runif(20, min = 39.2, max = 39.3), lng = runif(20, min = -76.6,
max = -76.5))
df %>%
leaflet() %>%
addTiles() %>%
addMarkers()
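When the coordinate columns are not called lat and lng, one option is to name them explicitly with leaflet's formula interface, where ~column refers to a column of the data passed to leaflet(). Here is a minimal sketch, assuming a small data frame whose name and column names (sites, site, longitude_deg, latitude_deg) are made up for illustration. The map is only built when the leaflet package is installed.

```r
# Hypothetical data frame with non-standard coordinate column names
sites <- data.frame(
  site = c("Taj Mahal - India", "Colosseum - Italy"),
  longitude_deg = c(78.0421, 12.4922),
  latitude_deg = c(27.1751, 41.8902)
)

# Only build the map when leaflet is installed
if (requireNamespace("leaflet", quietly = TRUE)) {
  library(leaflet)
  sites_map <- leaflet(sites) %>%
    addTiles() %>%
    # Each ~name is evaluated against the columns of `sites`
    addMarkers(lng = ~longitude_deg, lat = ~latitude_deg, popup = ~site)
}
```

Typing sites_map at the console displays the map, with one marker per row of the data frame.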
The blue markers that leaflet comes packaged with may not be enough depending on what you’re mapping.
Thankfully you can make your own markers from .png files.
unytIcon <- makeIcon(iconUrl = "https://upload.wikimedia.org/wikipedia/en/thumb/8/85/Logo_of_the_University_of_New_York%E2%80%93Tirana.svg/150px-Logo_of_the_University_of_New_York%E2%80%93Tirana.svg.png",
iconWidth = 31 * 215/230, iconHeight = 31, iconAnchorX = 31 * 215/230/2, iconAnchorY = 16)
unytLatLong <- data.frame(
  lat = c(41.3275),  # Latitude of UNYT
  lng = c(19.8189)   # Longitude of UNYT
)
unytLatLong %>%
  leaflet() %>%
  addTiles() %>%
  addMarkers(icon = unytIcon)
When adding multiple markers to a map, you may want to add popups for each marker.
You can specify a string of plain text for each popup, or you can provide HTML which will be rendered inside of each popup.
unytSites <- c("<a href='https://unyt.edu.al/'>University of New York Tirana</a>",
"<a href='https://unyt.edu.al/academics/'>UNYT Academics</a>", "<a href='https://unyt.edu.al/research/'>UNYT Research Center</a>",
"<a href='https://unyt.edu.al/admissions/'>UNYT Admissions</a>", "<a href='https://unyt.edu.al/contact/'>UNYT Contact</a>")
unytLatLong <- data.frame(
  lat = c(41.3275, 41.3280, 41.3268, 41.3272, 41.3282),  # Approximate locations
  lng = c(19.8189, 19.8195, 19.8178, 19.8201, 19.8182)   # Adjusted for different locations
)
unytIcon <- makeIcon(iconUrl = "https://upload.wikimedia.org/wikipedia/en/thumb/8/85/Logo_of_the_University_of_New_York%E2%80%93Tirana.svg/150px-Logo_of_the_University_of_New_York%E2%80%93Tirana.svg.png",
iconWidth = 31 * 215/230, iconHeight = 31, iconAnchorX = 31 * 215/230/2, iconAnchorY = 16)
unytLatLong %>%
leaflet() %>%
addTiles() %>%
addMarkers(icon = unytIcon, popup = unytSites)
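Typing each HTML link by hand gets tedious as the number of markers grows. Since paste0() is vectorized, the popup strings can be assembled from separate vectors of labels and URLs. This is a small sketch in base R: the vector names site_labels, site_urls, and site_popups are made up for illustration, and the values are taken from the first two links above.

```r
# Labels and URLs kept as plain vectors
site_labels <- c("University of New York Tirana", "UNYT Academics")
site_urls <- c("https://unyt.edu.al/", "https://unyt.edu.al/academics/")

# paste0() works element by element, producing one link per marker
site_popups <- paste0("<a href='", site_urls, "'>", site_labels, "</a>")
site_popups
```

The resulting character vector can be passed to the popup argument of addMarkers() in the same way as unytSites.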
Sometimes you might have so many points on a map that it doesn’t make sense to plot every marker.
In these situations leaflet allows you to plot clusters of markers using addMarkers(clusterOptions = markerClusterOptions()).
When you zoom in to each cluster, the clusters will separate until you can see the individual markers.
# Generate 500 random points around Tirana, Albania
df <- data.frame(
  lat = runif(500, min = 41.315, max = 41.335),  # Latitude range for Tirana
  lng = runif(500, min = 19.805, max = 19.830)   # Longitude range for Tirana
)
# Create a leaflet map with clustered markers
df %>%
  leaflet() %>%
  addTiles() %>%
  addMarkers(clusterOptions = markerClusterOptions())
Instead of adding markers or clusters, you can add circle markers using addCircleMarkers().
# Generate 20 random points around the University of New York Tirana
df <- data.frame(
  lat = runif(20, min = 41.315, max = 41.335),  # Latitude range for Tirana
  lng = runif(20, min = 19.805, max = 19.830)   # Longitude range for Tirana
)
# Create a leaflet map with circle markers
df %>%
  leaflet() %>%
  addTiles() %>%
  addCircleMarkers(
    radius = 5,               # Size of the circles
    color = "blue",           # Outline color
    fillColor = "lightblue",  # Fill color
    fillOpacity = 0.5,        # Transparency level
    stroke = TRUE
  )
You can draw arbitrary shapes on the maps you create, including circles and rectangles.
The code below draws a circle on each city whose area is proportional to the population of that city (the radius scales with the square root of the population).
# Data for major cities in Albania with estimated population
albania_cities <- data.frame(
name = c("Tirana", "Durres", "Vlore", "Shkoder", "Fier",
"Korce", "Berat", "Elbasan", "Lushnje", "Gjirokaster"),
pop = c(418495, 201110, 130827, 135612, 120655,
75694, 60331, 141714, 83274, 25000), # Approximate population values
lat = c(41.3275, 41.3167, 40.4667, 42.0667, 40.7250,
40.6167, 40.7050, 41.1125, 40.9333, 40.0783),
lng = c(19.8189, 19.4500, 19.4897, 19.5156, 19.5561,
20.7667, 19.9522, 20.0822, 19.7050, 20.1333)
)
# Create a leaflet map with circle sizes representing city population
albania_cities %>%
leaflet() %>%
addTiles() %>%
addCircles(
weight = 1,
radius = sqrt(albania_cities$pop) * 30, # Adjusted radius to scale with population
color = "blue", # Outline color
fillColor = "lightblue", # Fill color
fillOpacity = 0.5, # Transparency
popup = ~paste0("<b>", name, "</b><br>Population: ", pop) # Popup with city name & population
)
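To see why the code above takes the square root of the population, recall that the area of a circle grows with the square of its radius. The short check below is only a sketch: it reuses two of the population figures from the data frame and the same multiplier of 30, and the variable names are chosen here for illustration.

```r
# Two population values taken from the albania_cities data frame above
pop <- c(Tirana = 418495, Durres = 201110)

# Same scaling as in addCircles(): the radius grows with the square root of the population
radius <- sqrt(pop) * 30

# Area of each circle (square meters, since addCircles() radii are given in meters)
area <- pi * radius^2

# Compare the ratio of the areas with the ratio of the populations
area_ratio <- unname(area["Tirana"] / area["Durres"])
pop_ratio  <- unname(pop["Tirana"] / pop["Durres"])
all.equal(area_ratio, pop_ratio)
```

Mapping the population directly to the radius would instead make the area grow with the square of the population, which visually exaggerates large cities.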
You can add rectangles on leaflet maps as well:
# Create a leaflet map with a rectangle in Tirana
leaflet() %>%
addTiles() %>%
addRectangles(
lat1 = 41.3220, lng1 = 19.8100, # Bottom-left corner of rectangle
lat2 = 41.3300, lng2 = 19.8250, # Top-right corner of rectangle
color = "black", weight = 2, fillOpacity = 0.3
)
Adding a legend can be useful if you have markers on your map with different colors:
# Generate 20 random points around Tirana with different colors
df <- data.frame(
lat = runif(20, min = 41.315, max = 41.335), # Latitude range for Tirana
lng = runif(20, min = 19.805, max = 19.830), # Longitude range for Tirana
col = sample(c("red", "blue", "green"), 20, replace = TRUE),
stringsAsFactors = FALSE
)
# Create a leaflet map with circle markers and a legend
df %>%
leaflet() %>%
addTiles() %>%
addCircleMarkers(color = df$col, fillOpacity = 0.7, radius = 5) %>%
addLegend(position = "bottomright", labels = c("Category A", "Category B", "Category C"),
colors = c("blue", "red", "green"), title = "Legend for Categories")
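In the example above, nothing ties "Category A" to a particular color except the order of the two vectors passed to addLegend(). One way to keep the markers and the legend in sync is to define the mapping once in a named vector and derive both from it. The following is a sketch under that assumption; the lookup vector category_colors and the category column are hypothetical names introduced for illustration.

```r
library(leaflet)

# Hypothetical lookup: one color per category, defined in a single place
category_colors <- c("Category A" = "blue", "Category B" = "red", "Category C" = "green")

set.seed(42)  # make the random points reproducible
df <- data.frame(
  lat = runif(20, min = 41.315, max = 41.335),  # Latitude range for Tirana
  lng = runif(20, min = 19.805, max = 19.830),  # Longitude range for Tirana
  category = sample(names(category_colors), 20, replace = TRUE),
  stringsAsFactors = FALSE
)

# Look up each marker's color from its category
df$col <- unname(category_colors[df$category])

legend_map <- df %>%
  leaflet() %>%
  addTiles() %>%
  addCircleMarkers(color = ~col, fillOpacity = 0.7, radius = 5) %>%
  addLegend(position = "bottomright",
            labels = names(category_colors),
            colors = unname(category_colors),
            title = "Legend for Categories")
legend_map
```

Because the legend labels and colors come from the same vector as the marker colors, changing a color in category_colors updates both at once.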
The plotly R package provides an interface to the plotly JavaScript graphing library, allowing you to create interactive web-based graphics entirely in R.
plotly is a great choice for creating interactive graphics because you can create a wide variety of interactive graphics in multiple formats.
For example, you can execute your code in the console and interact with your graphic entirely in the viewer pane, or you could deploy your graphic to the web as a shiny app.
plotly is also backed by a strong community and is still under heavy development, making it a great time to learn how to harness its power.
As of November 2018, plotly downloads were an order of magnitude higher than its competitors like rbokeh and highcharter.
Before you start creating graphics, it’s important to think carefully about what type of graphic best suits your purpose: a static graphic, or an interactive graphic.
To highlight features of each type of graphic, let’s consider a scatterplot of proline against flavonoids, two chemicals found in wine.
A static plot, such as one rendered in ggplot2, remains permanently fixed.
This format is useful for printed materials such as reports, but can only display what you, the creator, have highlighted.
On the other hand, the user can update an interactive graphic.
For example, you can drill down to specific observations using hover info, or focus on subsets of your data by selecting or deselecting groups.
Simple interactions can improve your ability to explore your data, and throughout this course, you’ll learn how to add these to your graphics toolkit.
To begin, consider the wine dataset from the UCI Machine Learning Repository, containing the results of a chemical analysis of 178 wines all grown in the same region in Italy, but derived from three different cultivars.
We’ll begin by converting the static scatterplot of proline against flavanoids we saw earlier to a plotly interactive graphic.
# Install and load the rattle package
install.packages('rattle')
library(rattle)
library(dplyr)  # provides glimpse() and the pipe operator
# Load the wine dataset
data(wine)
# Display the structure of the dataset
glimpse(wine)
Rows: 178
Columns: 14
$ Type            <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ Alcohol         <dbl> 14.23, 13.20, 13.16, 14.37, 13.24, 14.20, 14.39, 14.06…
$ Malic           <dbl> 1.71, 1.78, 2.36, 1.95, 2.59, 1.76, 1.87, 2.15, 1.64, …
$ Ash             <dbl> 2.43, 2.14, 2.67, 2.50, 2.87, 2.45, 2.45, 2.61, 2.17, …
$ Alcalinity      <dbl> 15.6, 11.2, 18.6, 16.8, 21.0, 15.2, 14.6, 17.6, 14.0, …
$ Magnesium       <int> 127, 100, 101, 113, 118, 112, 96, 121, 97, 98, 105, 95…
$ Phenols         <dbl> 2.80, 2.65, 2.80, 3.85, 2.80, 3.27, 2.50, 2.60, 2.80, …
$ Flavanoids      <dbl> 3.06, 2.76, 3.24, 3.49, 2.69, 3.39, 2.52, 2.51, 2.98, …
$ Nonflavanoids   <dbl> 0.28, 0.26, 0.30, 0.24, 0.39, 0.34, 0.30, 0.31, 0.29, …
$ Proanthocyanins <dbl> 2.29, 1.28, 2.81, 2.18, 1.82, 1.97, 1.98, 1.25, 1.98, …
$ Color           <dbl> 5.64, 4.38, 5.68, 7.80, 4.32, 6.75, 5.25, 5.05, 5.20, …
$ Hue             <dbl> 1.04, 1.05, 1.03, 0.86, 1.04, 1.05, 1.02, 1.06, 1.08, …
$ Dilution        <dbl> 3.92, 3.40, 3.17, 3.45, 2.93, 2.85, 3.58, 3.58, 2.85, …
$ Proline         <int> 1065, 1050, 1185, 1480, 735, 1450, 1290, 1295, 1045, 1…
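Remember that there are three parts to a ggplot graphic: the dataset, the aesthetic mappings, and the geometry. Here we pipe in the wine dataset, map Flavanoids to the x-axis, Proline to the y-axis, and Type to color, and then add a point geometry.
library(ggplot2)
static <- wine %>%
  ggplot(aes(x = Flavanoids, y = Proline, color = Type)) +
  geom_point()
static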
The command ggplotly() allows you to convert a ggplot graphic to a plotly interactive graphic in a single line of code.
library(plotly)
ggplotly(static)
After loading the plotly package, pass the static ggplot object to the ggplotly() command, and an interactive version is created.
You learned how to convert your static ggplot2 plots into interactive plotly charts.
Not all ggplot objects can be converted to plotly objects, and sometimes you want more control over how your graphics are rendered.
In this lesson, we’ll explore how to create histograms and bar charts using plotly.
As first examples of univariate graphics, we’ll explore the distribution of the wine types and phenols using the wine dataset.
To explore the distribution of wine type, a categorical variable, we use a bar chart displaying the number of wines of each type.
library(dplyr)  # provides count()
library(plotly)
wine %>%
count(Type) %>%
plot_ly(x = ~Type, y = ~n) %>%
add_bars()
There are three fundamental parts to a plotly graphic: First, we have the dataset. Here we calculate a frequency table giving the number of wines of each type using the count() command.
We then pass this summarized dataset to the plot_ly() command, which creates our base layer, similar to the ggplot() function.
The second piece is the mapping of the variables in the dataset to aesthetics in the graph.
Here, we specify the mappings using tildes, x = ~Type, y = ~n, telling the plot which variable defines each aesthetic.
Third, we specify the plot type by adding a trace, similar to how we add a geometry in ggplot2.
To create a bar chart we add the pipe operator, %>%, after the plot_ly() base layer and specify add_bars().
With only three wine types our bar chart was easy enough to read; however, with more categories bar charts can become difficult to read unless the bars are sorted.
For example, we may wish to rearrange the bars in descending order.
To do this we use the fct_reorder() command found in the forcats package.
library(forcats)
wine %>%
count(Type) %>%
mutate(Type = fct_reorder(Type, n, .desc = TRUE)) %>%
plot_ly(x = ~Type, y = ~n) %>%
add_bars()
To sort the bars in descending order we add a single line of code to our data-plot pipeline: mutate() overwrites the Type variable, and fct_reorder() reorders the levels of Type by the values in n.
To organize the levels in descending order, we add the argument .desc = TRUE.
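To make the effect of fct_reorder() concrete, the sketch below applies it to a small frequency table and prints the factor levels before and after. The data frame type_counts and its counts are hypothetical stand-ins for the output of count(Type).

```r
library(forcats)

# Hypothetical frequency table with one row per wine type
type_counts <- data.frame(
  Type = factor(c("1", "2", "3")),
  n = c(59, 71, 48)
)

# Levels before reordering follow the default sorted order
levels(type_counts$Type)

# Reorder the levels of Type by n, largest count first
type_counts$Type <- fct_reorder(type_counts$Type, type_counts$n, .desc = TRUE)
levels(type_counts$Type)
```

Note that the rows of the data frame are not moved; only the order of the factor levels changes, and that order is what determines where each bar is drawn.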
To explore the distribution of phenols, a numeric variable, we use a histogram displaying the number of wines with phenols falling into equal-width bins.
We again need to specify three parts: First, we pipe the wine dataset into the plot_ly() command.
wine %>%
plot_ly(x = ~Phenols) %>%
add_histogram()
Next, we specify x = ~Phenols, indicating that Phenols should be plotted on the x-axis.
Finally, we add the histogram trace using the add_histogram() command.
Notice that we do not need to specify a variable for the y-axis here since plotly calculates the frequency for each bin in the background.
Whenever you create a histogram, it’s important to explore different binning schemes, since bins that are too wide may mask interesting features of the data, and bins that are too narrow provide little insight.
There are two ways to adjust the binning scheme in plotly.
The first is to change the number of bins displayed by adding the nbinsx argument to the add_histogram() command.
Here we specify that 10 bins should be displayed.
wine %>%
plot_ly(x = ~Phenols) %>%
add_histogram(nbinsx = 10)
The second way to change the binning is to specify the exact values for the bins.
Here, we specify xbins = list(start = 0.8, end = 4, size = 0.25), resulting in bins of width 0.25 spanning from 0.8 to 4.
wine %>%
plot_ly(x = ~Phenols) %>%
add_histogram(xbins = list(start = 0.8, end = 4, size = 0.25))
If you specify the exact values for the bins, be sure to look at a summary of the variable first so that you choose a logical start and end value.
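One way to choose the start and end values is to round the observed minimum down and the observed maximum up to a multiple of the bin width. The following is a minimal sketch of that idea; the vector x holds hypothetical measurements standing in for a numeric column such as Phenols, and bins is a name introduced here.

```r
# Hypothetical measurements standing in for a numeric column such as Phenols
x <- c(0.98, 1.35, 2.10, 2.80, 3.25, 3.88)

# Inspect the spread of the variable first
summary(x)

# Round the minimum down and the maximum up to a multiple of the bin width
bin_size <- 0.25
bins <- list(
  start = floor(min(x) / bin_size) * bin_size,
  end   = ceiling(max(x) / bin_size) * bin_size,
  size  = bin_size
)
str(bins)
```

The resulting list has the same structure as the one passed to add_histogram(xbins = ...) above, so it could be supplied there directly.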
Now, we extend your plotly toolkit to include bivariate graphics.
Specifically, you will learn how to explore associations using scatterplots, stacked bar charts, and boxplots.
Scatterplots allow us to explore the relationship between two numeric variables, such as the residual sugar and fixed acidity in wine.
As before, we begin by piping our dataset into the plot_ly() command.
Data we will use:
winequality <- read.csv("https://raw.githubusercontent.com/endri81/DataVisualization/refs/heads/main/data/winequality.csv",
sep = ",")
wine <- read.csv("https://raw.githubusercontent.com/endri81/DataVisualization/refs/heads/main/data/wine.csv",
sep = ",")
winequality %>%
plot_ly(x = ~residual_sugar, y = ~fixed_acidity) %>%
add_markers()
Next, we specify that residual sugar should be mapped to the x-axis and fixed acidity should be mapped to the y-axis.
Finally, we add a markers trace to plot a point for each ordered pair.
Stacked bar charts allow you to explore associations between two categorical variables.
For example, we can explore how the type of wine is related to the quality label.
To begin, we count the number of wines for each combination of type and quality label.
Next, we map variables to the x-axis, y-axis, and color of the segments.
In this example, we map type to the x-axis, n to the y-axis, and quality label to color.
winequality %>%
count(type, quality_label) %>%
plot_ly(x = ~type, y = ~n, color = ~quality_label) %>%
add_bars() %>%
layout(barmode = "stack")
We add the bars trace as before, but we have to refine the layout in order to create a stacked bar chart since the bars plot side-by-side by default.
To stack the bars, we modify the layout of the bar chart by specifying barmode = "stack".
The stacked bar chart of the counts we just created is useful for comparing the total number of high, medium, and low-quality wines across type.
If, however, we are interested in comparing the distribution of quality between red and white wines, it may be more useful to plot proportions on the y-axis.
Changing from a stacked bar chart of counts to a stacked bar chart of proportions is a data manipulation problem, which is solved in two lines of code.
First, we group the table of counts by the x-variable, which is type in our example.
Next, we use mutate to calculate the proportions within each group, and store it in the prop column.
winequality %>%
count(type, quality_label) %>%
group_by(type) %>%
mutate(prop = n/sum(n)) %>%
plot_ly(x = ~type, y = ~prop, color = ~quality_label) %>%
add_bars() %>%
layout(barmode = "stack")
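The two data manipulation steps can be checked on a small table before plotting. The sketch below uses a hypothetical table of counts shaped like the output of count(type, quality_label) and confirms that the proportions within each wine type add up to 1.

```r
library(dplyr)

# Hypothetical table of counts, shaped like the output of count(type, quality_label)
counts <- data.frame(
  type = c("red", "red", "red", "white", "white", "white"),
  quality_label = c("low", "medium", "high", "low", "medium", "high"),
  n = c(30, 50, 20, 10, 60, 30)
)

props <- counts %>%
  group_by(type) %>%             # proportions are computed within each wine type
  mutate(prop = n / sum(n)) %>%
  ungroup()
props

# Each type's proportions should add up to 1
check <- props %>%
  group_by(type) %>%
  summarise(total = sum(prop))
check
```

Without group_by(), sum(n) would be the grand total, and the stacked bars would show each cell's share of all wines rather than its share within red or white.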
So far we’ve talked about exploring associations between either two numeric or two categorical variables.
Boxplots are one way to explore how the distribution of a numeric variable may change based on the level of a categorical variable.
For example, here we see that the distribution of alcohol content is positively associated with wine quality: higher quality wines tend to have higher alcohol content. Of course, there is substantial variability.
To create this set of side-by-side boxplots we map the quality label to the x-axis, alcohol to the y-axis, and add a boxplot trace.
winequality %>%
plot_ly(x = ~quality_label, y = ~alcohol) %>%
add_boxplot()
You’ve seen how to create a variety of common graphics using plotly.
Now, you’ll learn how to style and customize your graphics so that they are ready to publish. To begin, you’ll learn how to style traces.
First, let’s consider how to change the color of the markers placed on the canvas when you add a trace.
As an example, let’s create a histogram of fixed acidity.
By default, the bars are blue, but what if you don’t like blue, or if you want to match the theme of your website or organization?
To change the color of all markers placed in a trace, such as all of the bars in this histogram, we add the color argument to the trace.
For example, to create a red histogram we add the argument color = I("red").
winequality %>%
plot_ly(x = ~fixed_acidity) %>%
add_histogram()
Here we use the as-is function, I(), to set the color of the histogram.
winequality %>%
plot_ly(x = ~fixed_acidity) %>%
add_histogram(color = I("red"))
Without this function, plotly assumes that you are mapping a variable to the color.
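What I() does can be seen in base R: it tags a value with the class "AsIs", which gives a package a way to tell a literal value apart from something that should be mapped. The snippet below is only an illustration of that tagging, not a description of plotly's internals; the object names are chosen here.

```r
# A plain character string and the same string wrapped in I()
plain <- "red"
as_is <- I("red")

# I() adds the class "AsIs" without changing the underlying value
class(plain)
class(as_is)
inherits(as_is, "AsIs")
as.character(as_is)
```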
Another way to customize your graphics is to adjust the opacity of the markers.
For example, on this scatterplot of fixed_acidity against residual_sugar many points are overlapping or close together, making the data appear as a blob on the left side.
This is an issue called overplotting.
One approach to overcome overplotting is to increase the transparency of the points, or said another way, to decrease the opacity of the points.
winequality %>%
plot_ly(x = ~residual_sugar, y = ~fixed_acidity) %>%
add_markers()
Here is the scatterplot where points are only 20% opaque; that is, they are 80% transparent.
This effect allows us to see the density of points on the scatterplot: high-density regions appear darker than low-density regions.
Only one change is made to the code: we pass a list with opacity = 0.2 to the marker argument of add_markers().
winequality %>%
plot_ly(x = ~residual_sugar, y = ~fixed_acidity) %>%
add_markers(marker = list(opacity = 0.2))
Another way to address the issue of overplotting is to change the plotting symbol from a filled glyph to an open glyph.
For example, an open circle is drawn for each point on this version of the scatterplot.
winequality %>%
plot_ly(x = ~residual_sugar, y = ~fixed_acidity) %>%
add_markers(marker = list(symbol = "circle-open"))
To change the plotting symbol to an open circle, we pass a list with the element symbol = "circle-open" to the marker argument of the trace.
Notice that we certainly get more of a sense of point density than the original version, but adjusting the opacity is probably preferred.
There are other marker options that we haven’t discussed here, but they are intuitive to add.
For example, we can change the size of the points on our scatterplot by adding a size argument to the list.
Or we can specify the width of the lines dividing bars on histograms and bar charts.
winequality %>%
plot_ly(x = ~residual_sugar, y = ~fixed_acidity) %>%
add_markers(marker = list(symbol = "diamond", size = 4))
You learned how to style your traces.
Now, you’ll learn how to use color to thoughtfully represent the values of a variable in your dataset.
To begin, consider this scatterplot of alcohol against flavonoids in wine.
A scatterplot with a smoother reveals a possibly nonlinear relationship between the variables.
However, the plot doesn’t consider how this relationship might be explained by a third variable, such as the wine type.
Once we add color representing wine type to this scatterplot, the nonlinear relationship is explained.
wine %>%
plot_ly(x = ~Flavanoids, y = ~Alcohol, color = ~Type) %>%
add_markers()
The overall trend appears nonlinear since the relationship between alcohol and flavanoids differs by wine type.
As always, we begin by piping in the data and specifying our aesthetic mappings.
To add color to the plot we specified that Type should be mapped to color, in addition to mapping Flavanoids and Alcohol to the x- and y-axes.
Then, we add the markers trace to create the scatterplot.
In the previous example, we used color to represent a factor, but color can also be used to add a third numeric variable to a scatterplot.
For example, here we use color to represent the color intensity of the wine.
wine %>%
plot_ly(x = ~Flavanoids, y = ~Alcohol, color = ~Color) %>%
add_markers()
It appears that wines with higher alcohol content tend to have higher color intensities.
Notice that plotly is using a color gradient to represent this numeric variable.
We again only need to add an aesthetic mapping for color and plotly determines the method to display the variable as well as the default color palette.
This is important to keep in mind in case factors have been recorded and loaded as numbers.
In this situation, you may wish to convert the variable to a factor so that discrete colors are chosen.
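A minimal sketch of that conversion is shown below. The data frame wine_codes is hypothetical; it imitates a file in which wine type was stored as the numbers 1, 2, and 3.

```r
# Hypothetical data in which wine type was recorded as the numbers 1, 2, 3
wine_codes <- data.frame(
  Type = c(1, 2, 3, 1, 2),
  Alcohol = c(14.2, 12.4, 13.1, 13.8, 12.0)
)

# As a numeric column, Type would be displayed with a continuous color gradient
is.numeric(wine_codes$Type)

# Convert to a factor so that each type is treated as a separate category
wine_codes$Type <- factor(wine_codes$Type)
is.factor(wine_codes$Type)
levels(wine_codes$Type)
```

After the conversion, mapping color = ~Type uses one discrete color per level rather than a gradient.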
The default color palettes chosen by plotly are often sensible, but there are times that you may wish or need to change them.
For example, you may wish to make your graphic more colorblind safe or match the theme of the website where you plan to host the graphic.
Let’s change the palette of this scatterplot coded by wine type.
wine %>%
plot_ly(x = ~Flavanoids, y = ~Alcohol, color = ~Type) %>%
add_markers(colors = "Dark2")
To change the colors used in the scatterplot, we add the colors argument to the markers trace.
There are numerous palettes available in plotly, such as those provided in the RColorBrewer package.
Here, we specify colors = "Dark2" to change the palette.
If you are trying to style your graph according to a specific theme, it’s unlikely that the palette is already defined in the RColorBrewer package.
Luckily, the colors argument accepts vectors of valid R color codes.
For example, we can pass colors the character vector c("orange", "black", "skyblue") to change the palette.
wine %>%
plot_ly(x = ~Flavanoids, y = ~Alcohol, color = ~Type) %>%
add_markers(colors = c("orange", "black", "skyblue"))
In addition to named colors, you can also use numerous formats including rgb and hex to define your palette.
At this point, it’s important to note the difference in syntax between changing a palette and setting the color of all points using the as-is function, I().
When you change a palette, you have already mapped a variable, such as wine type, to the color aesthetic; thus, plotly knows that the vector passed to the colors argument in add_markers() refers to the palette and not to the individual observations.
Now that you know how to customize the appearance of your traces, you’ll learn how to customize a key interactive tool: the hover information. By default, plotly will add hover information to your chart. Let’s see what information appears by default.
First, let’s consider the scatterplot of alcohol content against flavanoids.
wine %>%
plot_ly(x = ~Flavanoids, y = ~Alcohol, hoverinfo = "text", text = ~paste("Flavanoids:",
Flavanoids, "<br>", "Alcohol:", Alcohol)) %>%
add_markers()
As we hover our mouse above specific points the coordinates appear; however, they are displayed only as coordinate pairs, without variable names.
Next, let’s consider a bar chart of wine type.
wine %>%
count(Type) %>%
plot_ly(x = ~Type, y = ~n, hoverinfo = "y") %>%
add_bars()
The hover info again reveals the coordinates of the top of each bar but lacks variable names.
As another example, consider boxplots of alcohol content by wine quality.
Here, the hover information gives the category name, high, medium, or low, as well as summary statistics, including the minimum, first quartile, median, third quartile, and maximum.
By now you see the pattern. By default, plotly shows the coordinates in the hover info, but the default settings can be improved.
In some situations, a simple change to the hover info, such as only displaying the y coordinate may make your chart more intuitive.
winequality %>%
plot_ly(x = ~free_so2, y = ~total_so2) %>%
add_markers(marker = list(opacity = 0.2))
For example, on a bar chart, each bar is clearly labeled with the level of a factor.
Let’s take a look at the code. The only change necessary is to add the hoverinfo argument to plot_ly().
The hoverinfo argument allows you to choose what information is displayed.
By default hoverinfo is set to "all", displaying information about all elements of a trace.
To restrict the amount of information, you can set hoverinfo to "x" or "y", or you can remove it altogether by setting hoverinfo to "none".
For multivariate displays, "x+y" and "x+y+z" are additional options.
Let’s return to the scatterplot of alcohol content against flavanoids.
winequality %>%
plot_ly(x = ~free_so2, y = ~total_so2) %>%
add_markers(marker = list(opacity = 0.2)) %>%
layout(xaxis = list(title = "Free SO2 (ppm)"), yaxis = list(title = "Total SO2 (ppm)"))
Here, we want to customize the hover information to include the variable names, rather than simply displaying an ordered pair.
To customize the hover info there are two key arguments that are passed to the plot_ly() function: hoverinfo and text.
winequality %>%
plot_ly(x = ~free_so2, y = ~total_so2) %>%
add_markers(marker = list(opacity = 0.2)) %>%
layout(title = "Does free SO2 predict total SO2 in wine?", xaxis = list(title = "Free SO2 (ppm, log scale)",
type = "log"), yaxis = list(title = "Total SO2 (ppm, log scale)", type = "log"))
We set hoverinfo = "text" to indicate that we will be manually defining the information.
The text argument expects a character string, so we use the paste() function to string together the information we want to display.
winequality %>%
plot_ly(x = ~free_so2, y = ~total_so2) %>%
add_markers(marker = list(opacity = 0.5)) %>%
layout(xaxis = list(title = "Free SO2 (ppm)", zeroline = FALSE), yaxis = list(title = "Total SO2 (ppm)",
zeroline = FALSE, showgrid = FALSE))
Here, we type polished variable labels as character strings and use the variable without quotes to specify the value.
To start a new line, use the HTML line-break tag, <br>.
winequality %>%
plot_ly(x = ~free_so2, y = ~total_so2) %>%
add_markers(marker = list(opacity = 0.5)) %>%
layout(xaxis = list(title = "Free SO2 (ppm)"), yaxis = list(title = "Total SO2 (ppm)"),
plot_bgcolor = toRGB("gray90"), paper_bgcolor = toRGB("skyblue"))
Finally, notice that we use the tilde operator to specify the hover info text.
Remember that the tilde operator is used to map columns to aesthetic parameters.
Here, we map the variables Flavanoids and Alcohol to the hover info. Without this tilde, plotly would return an error, since these objects are not in the global environment, but are stored as columns in our dataset.
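The role of the tilde can be illustrated in base R. A one-sided formula stores an expression without evaluating it, so the column names can be looked up later inside a data frame. The sketch below is conceptually similar to what happens when a formula is mapped to text, but it is not plotly's actual implementation; wine_sample, hover_formula, and the other names are introduced here for illustration.

```r
# Two rows with the same column names as the wine data (values are illustrative)
wine_sample <- data.frame(Flavanoids = c(3.06, 2.76), Alcohol = c(14.23, 13.20))

# A one-sided formula stores the expression without evaluating it
hover_formula <- ~paste("Flavanoids:", Flavanoids, "<br>", "Alcohol:", Alcohol)

# Evaluating the right-hand side inside the data frame finds the columns
hover_text <- eval(hover_formula[[2]], wine_sample)
hover_text

# Without the tilde, the expression is evaluated immediately, where no such object exists
no_data <- tryCatch(
  paste("Flavanoids:", Flavanoids),
  error = function(e) "error: object not found"
)
no_data
```

The first result is one hover string per row; the second shows the kind of failure described above when the columns are looked up outside the dataset.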
Previously, you have focused on creating a plotly chart based on a single trace.
Now, you will learn how to layer traces to create more complex charts.
Specifically, you’ll learn how to add smoothers to scatterplots and overlay density plots.
While complex charts often look impressive, it’s important to remember that the simplest chart that conveys your message is usually the best choice. So layer cautiously.
As a first example, consider adding a LOESS smoother to the scatterplot of alcohol content against flavonoids to highlight the nonlinear structure.
We begin by fitting the LOESS model and storing it as the object m.
m <- loess(Alcohol ~ Flavanoids, data = wine, span = 1.5)
wine %>%
plot_ly(x = ~Flavanoids, y = ~Alcohol) %>%
add_markers() %>%
add_lines(y = ~fitted(m)) %>%
layout(showlegend = FALSE)
Next, we construct a scatterplot with Flavonoids on the x-axis and Alcohol on the y-axis.
To add another layer, we pipe the plot into another trace layer.
Here, we wish to add a line for representing the fitted model, so we add the lines trace.
m2 <- lm(Alcohol ~ poly(Flavanoids, 2), data = wine)
wine %>%
plot_ly(x = ~Flavanoids, y = ~Alcohol) %>%
add_markers(showlegend = FALSE) %>%
add_lines(y = ~fitted(m))
To plot the fitted values we need to map Flavonoids to the x-axis and the fitted values to the y-axis.
Notice here that we only need to explicitly map fitted(m) to the y aesthetic because the x aesthetic will be inherited from the base layer.
Finally, we specify showlegend = FALSE to remove the legend.
m2 <- lm(Alcohol ~ poly(Flavanoids, 2), data = wine)
wine %>%
plot_ly(x = ~Flavanoids, y = ~Alcohol) %>%
add_markers(showlegend = FALSE) %>%
add_lines(y = ~fitted(m)) %>%
layout(showlegend = FALSE)
If we retained the legend, then each trace would be represented, and it seems silly to have a legend differentiating the observed points from the fitted values.
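The mapping y = ~fitted(m) used above relies on fitted() returning one value per row of the data, in the original row order, so that each fitted value lines up with its x value. The sketch below checks this on simulated data; the data frame toy and the model m_toy are hypothetical stand-ins for wine and m.

```r
set.seed(1)
# Hypothetical data standing in for the wine data frame
toy <- data.frame(x = runif(50, min = 0, max = 5))
toy$y <- 12 + sin(toy$x) + rnorm(50, sd = 0.3)

# Fit a LOESS smoother, as done for Alcohol ~ Flavanoids above
m_toy <- loess(y ~ x, data = toy)

# fitted() returns one value per row, in the original row order
fit_vals <- fitted(m_toy)
length(fit_vals) == nrow(toy)

# Pairing each x with its fitted value and sorting by x gives the curve to draw
curve_df <- data.frame(x = toy$x, fit = fit_vals)
curve_df <- curve_df[order(curve_df$x), ]
head(curve_df, 3)
```

The explicit sort is shown only to make the idea visible; add_lines() is documented as drawing the line with the x values sorted, so the plotting code above does not need it. Rows with missing values are a case to watch, since a model may drop them and return fewer fitted values than there are rows.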
To compare two models we can add multiple layers to the chart. For example, we may wish to compare how a second-order polynomial fits the data compared to the LOESS fit.
m2 <- lm(Alcohol ~ poly(Flavanoids, 2), data = wine)
wine %>%
plot_ly(x = ~Flavanoids, y = ~Alcohol) %>%
add_markers(showlegend = FALSE) %>%
add_lines(y = ~fitted(m), name = "LOESS") %>%
add_lines(y = ~fitted(m2), name = "Polynomial")
We begin by fitting the polynomial model using the lm() function and storing it as the object m2.
The LOESS model is still stored in the object m. The code to create the chart is nearly identical to the previous slide, but we now have three layers with different traces: we add the points using add_markers(), the LOESS fit using the first add_lines() trace, and the polynomial fit using the second add_lines() trace.
The key differences in the code relate to the creation of the legend, which is necessary to differentiate the two models.
We can name each trace by adding the name argument. Here, we have added name = "LOESS" and name = "Polynomial".
In the markers trace, we specify showlegend = FALSE to remove this trace from the legend since it is obvious that the points represent observed data.
Layering is also a useful tool when comparing distributions.
For example, here we layer density plots to compare the distribution of Flavanoids between wine types.
d1 <- filter(wine, Type == 1)
d2 <- filter(wine, Type == 2)
d3 <- filter(wine, Type == 3)
density1 <- density(d1$Flavanoids)
density2 <- density(d2$Flavanoids)
density3 <- density(d3$Flavanoids)
If you’re not familiar with density plots, you can think of them as smoothed histograms.
Layering density plots requires three steps:
First, we create subsets for each wine type using filter().
Next, we use the density() command to calculate the x, y coordinates needed for the smoothers.
We then add three layers to our chart using add_lines(), mapping x and y to the appropriate aesthetics, and setting the name of the trace to the wine type.
plot_ly(opacity = 0.5) %>%
add_lines(x = ~density1$x, y = ~density1$y, name = "Type 1") %>%
add_lines(x = ~density2$x, y = ~density2$y, name = "Type 2") %>%
add_lines(x = ~density3$x, y = ~density3$y, name = "Type 3") %>%
layout(xaxis = list(title = "Flavonoids"), yaxis = list(title = "Density"))
We also polish the axis labels.
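To see what the density() command hands to add_lines(), the sketch below inspects its result for a simulated sample. The vector values is hypothetical and stands in for the Flavanoids measurements of one wine type.

```r
set.seed(1)
# Hypothetical measurements standing in for one wine type's Flavanoids values
values <- rnorm(60, mean = 3, sd = 0.4)

dens <- density(values)

# The result holds a grid of x positions and the estimated density at each one
names(dens)[1:2]
length(dens$x)
length(dens$y)

# The grid extends slightly beyond the observed range of the data
range(values)
range(dens$x)
```

The x and y components have the same length, which is why they can be mapped directly to the x and y aesthetics of a lines trace.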
Creating a series of subplots is another powerful way to explore the impact of additional variables.
We’ll use subplots to explore the vgsales2016 dataset, which contains information about video games released in 2016, including sales and ratings.
vgsales2016 <- read.csv("https://raw.githubusercontent.com/endri81/DataVisualization/refs/heads/main/data/vgsales2016.csv",
sep = ",")
glimpse(vgsales2016)
Rows: 16,450
Columns: 16
$ Name         <chr> "Wii Sports", "Super Mario Bros.", "Mario Kart Wii", "Wii…
$ Platform     <chr> "Wii", "NES", "Wii", "Wii", "GB", "GB", "DS", "Wii", "Wii…
$ Year         <int> 2006, 1985, 2008, 2009, 1996, 1989, 2006, 2006, 2009, 198…
$ Genre        <chr> "Sports", "Platform", "Racing", "Sports", "Role-Playing",…
$ Publisher    <chr> "Nintendo", "Nintendo", "Nintendo", "Nintendo", "Nintendo…
$ NA_Sales     <dbl> 41.36, 29.08, 15.68, 15.61, 11.27, 23.20, 11.28, 13.96, 1…
$ EU_Sales     <dbl> 28.96, 3.58, 12.76, 10.93, 8.89, 2.26, 9.14, 9.18, 6.94, …
$ JP_Sales     <dbl> 3.77, 6.81, 3.79, 3.28, 10.22, 4.22, 6.50, 2.93, 4.70, 0.…
$ Other_Sales  <dbl> 8.45, 0.77, 3.29, 2.95, 1.00, 0.58, 2.88, 2.84, 2.24, 0.4…
$ Global_Sales <dbl> 82.53, 40.24, 35.52, 32.77, 31.37, 30.26, 29.80, 28.92, 2…
$ Critic_Score <int> 76, NA, 82, 80, NA, NA, 89, 58, 87, NA, NA, 91, NA, 80, 6…
$ Critic_Count <int> 51, NA, 73, 73, NA, NA, 65, 41, 80, NA, NA, 64, NA, 63, 4…
$ User_Score   <chr> "8", NA, "8.3", "8", NA, NA, "8.5", "6.6", "8.4", NA, NA,…
$ User_Count   <int> 322, NA, 709, 192, NA, NA, 431, 129, 594, NA, NA, 464, NA…
$ Developer    <chr> "Nintendo", NA, "Nintendo", "Nintendo", NA, NA, "Nintendo…
$ Rating       <chr> "E", NA, "E", "E", NA, NA, "E", "E", "E", NA, NA, "E", NA…
You created a scatterplot exploring the relationship between user and critic scores, where color represented genre.
vgsales2016 %>%
plot_ly(x = ~Critic_Score, y = ~User_Score, color = ~Genre) %>%
add_markers()
There are 11 video game genres in 2016, making it difficult to choose a color palette that makes visual comparison easy.
In fact, if you run this code in your console, you’ll receive a warning message that the default palette only has 8 colors.
To overcome this difficulty we can create multiple graphs of the dataset — one for each category — arranged in a series.
These are often called small multiples, trellis graphics, faceted graphics, or subplots.
To understand how subplots are constructed, let’s create a plot for only the action genre.
First, we extract the rows corresponding to action games using filter(Genre == "Action").
action_df <- vgsales2016 %>%
filter(Genre == "Action")
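This new data frame, action_df, contains only the action games: 3,308 observations with the data as loaded above, matching the Action row of the nested tibble shown later.
Next, we pipe action_df into our scatterplot code to plot these observations.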
action_df %>%
plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
add_markers()
Now, let’s consider how to create a series of subplots.
When there are a small number of subplots this can be done manually by storing each plot as an object and combining them using the subplot() function.
For example, we can store the action scatterplot we just created as p1 and create another scatterplot for adventure games, storing this plot as p2.
# Two Subplots
p1 <- action_df %>%
plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
add_markers()
p2 <- vgsales2016 %>%
filter(Genre == "Adventure") %>%
plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
add_markers()
subplot(p1, p2, nrows = 1)
Finally, we use the subplot() function to combine p1 and p2 into a grid of plots. Here we specify nrows = 1 to produce a single row of plots.
Notice that when we create subplots, the default legend simply gives numeric labels to the traces, which is extremely uninformative.
To add an informative legend to the subplot, we map genre to the name of the trace.
p1 <- action_df %>%
plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
add_markers(name = ~Genre)
p2 <- vgsales2016 %>%
filter(Genre == "Adventure") %>%
plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
add_markers(name = ~Genre)
subplot(p1, p2, nrows = 1)
Now we have Action and Adventure in the legend, providing far more information.
Now that we have an informative legend, it’s time to add informative axis labels.
subplot(p1, p2, nrows = 1, shareY = TRUE, shareX = TRUE)
If the subplots share the same x- and y-axes, as they do in faceted plots, then adding shareX = TRUE and shareY = TRUE to the subplot() command will add axis labels for the variable names.
Additionally, sharing an axis allows interactivity to be linked.
For example, if we zoom in on the y-axis on the left plot, the y-axis will be restricted on both plots.
If this isn’t the desired behavior, you can use the titleX and titleY arguments to show axis titles instead of setting shareX and shareY to TRUE.
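As a rough sketch of that alternative, the call below keeps the axes independent but still displays the axis titles. The two small data frames, toy_a and toy_b, are made up for illustration and stand in for the genre subsets.

```r
library(plotly)

# Hypothetical toy data standing in for two genre subsets
toy_a <- data.frame(Critic_Score = c(60, 75, 90), User_Score = c(6.1, 7.4, 8.8))
toy_b <- data.frame(Critic_Score = c(55, 70, 85), User_Score = c(5.9, 7.0, 8.2))

p1 <- plot_ly(toy_a, x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(name = "A")
p2 <- plot_ly(toy_b, x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(name = "B")

# titleX/titleY keep each panel's axis titles without linking the axes
fig <- subplot(p1, p2, nrows = 1, titleX = TRUE, titleY = TRUE)
fig
```

Because shareX and shareY are left at their default of FALSE, zooming in one panel should leave the other panel unchanged.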
Manually creating facets is tedious and, as with all copy-and-paste solutions, is error-prone.
A better approach is to automate this split and plot procedure.
Let’s see how to create this faceted scatterplot of user score against critic score for all genres using tidyverse tools.
To begin, we load the tidyverse because we need tools from the dplyr, tidyr, and purrr packages.
Then, we pipe the data set into group_by() to create subsets for each genre and nest() the results.
This produces a tibble with a column for Genre and a column called data that houses the remaining data for each genre.
Next, we add a plot column to store the plotly objects for each Genre using mutate().
To create a plotly object for each Genre, we use the map2() function from the purrr package. map2() allows us to iterate over two arguments in parallel.
In our example it iterates over the elements of the Genre and data columns in the tibble.
vgsales2016 %>%
group_by(Genre) %>%
nest() %>%
mutate(plot = map2(data, Genre, \(data, Genre) plot_ly(data = data, x = ~Critic_Score,
y = ~User_Score) %>%
add_markers(name = ~Genre)))
# A tibble: 13 × 3
# Groups:   Genre [13]
   Genre        data                   plot
   <chr>        <list>                 <list>
 1 Sports       <tibble [2,306 × 15]>  <plotly>
 2 Platform     <tibble [878 × 15]>    <plotly>
 3 Racing       <tibble [1,226 × 15]>  <plotly>
 4 Role-Playing <tibble [1,483 × 15]>  <plotly>
 5 Puzzle       <tibble [569 × 15]>    <plotly>
 6 Misc         <tibble [1,721 × 15]>  <plotly>
 7 Shooter      <tibble [1,296 × 15]>  <plotly>
 8 Simulation   <tibble [858 × 15]>    <plotly>
 9 Action       <tibble [3,308 × 15]>  <plotly>
10 Fighting     <tibble [837 × 15]>    <plotly>
11 Adventure    <tibble [1,293 × 15]>  <plotly>
12 Strategy     <tibble [673 × 15]>    <plotly>
13 <NA>         <tibble [2 × 15]>      <plotly>
For each genre-data pair, it then creates a plotly object.
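The same pairing can be seen on a toy example that returns text instead of plots. This is only an illustrative sketch: the toy tibble is made up, and map2_chr() (the purrr variant of map2() that returns a character vector) is swapped in so the result is easy to inspect.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data: two genres with a few scores each
toy <- tibble(
  Genre = c("Action", "Action", "Puzzle"),
  Critic_Score = c(80, 65, 90)
)

nested <- toy %>%
  group_by(Genre) %>%
  nest() %>%
  # map2_chr() walks the data and Genre columns in parallel,
  # passing one nested data frame and one genre name per call
  mutate(label = map2_chr(data, Genre, \(data, Genre) paste(Genre, nrow(data))))

nested$label
```

Each call receives one nested data frame together with its genre name, which is the same mechanism that lets the plotting code label each trace with its genre.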
vgsales2016 %>%
group_by(Genre) %>%
nest() %>%
mutate(plot = map2(data, Genre, \(data, Genre) plot_ly(data = data, x = ~Critic_Score,
y = ~User_Score) %>%
add_markers(name = ~Genre))) %>%
subplot(nrows = 2)
To create this plotly object, we write an anonymous function that takes genre and data as inputs.
As of R 4.1, we can write such an anonymous function in the shorthand you see here: \(data, Genre) begins the definition of the anonymous function, and we then use our typical plotly code as the body of the function.
To be more verbose, you can write the word function in place of the backslash, but I show the backslash usage here to align with the help files.
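To make the equivalence concrete, here is a minimal base R sketch. The names add_short and add_long are hypothetical and chosen only for illustration.

```r
# Two equivalent ways to write the same anonymous function (R >= 4.1)
add_short <- \(x, y) x + y
add_long  <- function(x, y) x + y

# Apply each function element-wise over two vectors
short_res <- mapply(add_short, 1:3, 4:6)
long_res  <- mapply(add_long, 1:3, 4:6)

# Both forms give the same result
identical(short_res, long_res)
```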
Finally, we pipe the results into subplot() to produce the graphic, specifying nrows = 2 to produce a grid of subplots with two rows.
wine %>%
plot_ly() %>%
add_trace(type = "splom", dimensions = list(list(label = "Alcohol", values = ~Alcohol),
list(label = "Flavonoids", values = ~Flavanoids), list(label = "Color", values = ~Color)))